AMENDMENTS TO THE CLAIMS 

The following listing of claims will replace all prior versions and listings of claims 
in the application. 

Listing Of Claims 

1. (Currently Amended) A system for tuning the text-to-speech conversion 
process, the system comprising: 

a text-to-speech engine, said text-to-speech engine receiving at least one text- 
input and converting said text-input into a processed representation, 

said processed representation including at least one speech feature associated 
with at least one segment of said representation; and 

a visual editing interface, said visual editing interface displaying said processed 
representation using at least one graphical indicator on an output device, wherein said 
segment is displayed on said output device using said graphical indicator corresponding 
to said speech feature, said speech feature is at least one of boundary strength or 
pause duration, and said graphical indicator is at least one of a display property of a 
boundary between adjacent segments or spacing between textual contents of the 
adjacent segments thereby communicating variations in one or more types of speech
features associated with segments of said representation by varying visual display
properties of the segments.

2. (Previously Presented) The system of Claim 1 wherein said visual editing
interface provides at least one editing function to a user, the editing function enabling 
modification of said speech feature associated with said segment through a change in 
the corresponding said graphical indicator. 

3. (Original) The system of Claim 2 wherein said visual editing interface 
associates said speech feature corresponding to said segment with said graphical 
indicator, wherein the user's modification of said graphical indicator results in a 
corresponding change in said speech feature of said segment. 

4. (Original) The system of Claim 1 wherein said speech feature is at least 
one of the following: normalized text, part-of-speech, parsing of text, chunking of text, 
boundary strength, pause duration, transcription, speech rate, syllable duration, 
segment duration, pitch, word prominence, emphasis, formant mixing mode, unit 
selection override, intensity contour, formant trajectories, and allophone rules. 

5. (Original) The system of Claim 1 wherein said graphical indicator 
comprises at least one of the following: graphical style, font faces, coloring, vertical 
spacing, horizontal spacing, italicization, boldness, underlining, blinking, crossing-out, 
text orientation, text rotation, punctuation symbols and graphical symbols. 

6. (Original) The system of Claim 1 wherein said processed representation 
employs a parameterized aligned sound records format. 

7. (Original) The system of Claim 1 wherein said segment comprises at least
one of the following: word, letter, syllable, pause, word boundary and punctuation-mark. 

8. (Original) The system of Claim 1 wherein said visual editing interface 
operates as a plug-in for a graphical user interface. 

9. (Original) The system of Claim 8 wherein said plug-in is an ActiveX 
control. 

10. (Previously Presented) The system of Claim 1 wherein said visual editing 
interface allows definition of said input-text by providing a set of text messages 
containing non-editable text and editable blank slots into which at least part of said 
input-text can be entered. 

11. (Original) The system of Claim 1 wherein said visual editing interface is
language independent. 

12. (Original) The system of Claim 1 wherein said visual editing interface 
provides the user with speech audio output of said processed representation. 

13. (Original) The system of Claim 1 wherein visual editing interface is 
connected to a data-store for storing and retrieving said representation. 

14. (Previously Presented) The system of Claim 1 wherein said processed
representation is a modified textual representation of the processed input-text. 

15. (Previously Presented) The system of Claim 14 wherein the input text is 
used to generate said processed representation. 

16. (Previously Presented) The system of Claim 15 wherein said modified 
textual representation is stored and accessed from a data store. 

17. (Previously Presented) The system of Claim 14 wherein said modified 
textual representation is used to generate synthesized speech using a TTS system 
distinct from said text-to-speech engine. 

18. (Currently Amended) A system for providing a text-to-speech interface, the
system comprising: 

a visual interface connected to a text-to-speech engine; and 
at least one communication channel connecting said visual interface to said text- 
to-speech engine, said text-to-speech engine communicating with said visual interface 
over said communication channel by sending and receiving at least one data segment 
in a format, 

wherein said visual interface communicates variations in one or more types of 
speech features associated with segments of said data by varying visual display 
properties of the segments, at least one of said speech features is at least one of 
boundary strength or pause duration, and said visual display properties are applied to at
least one of a boundary between adjacent segments or spacing between textual
contents of the adjacent segments.

19. (Original) The system of claim 18 wherein said format of said data 
segment is a parameterized aligned sound records format. 

20. (Original) The system of claim 18 wherein said text-to-speech engine 
sends said data segment in the parameterized aligned sound records format to said 
visual interface, said visual interface rendering said data segment in a visual form, said 
visual interface allowing editing of said data segment to produce an edited data 
segment, said visual interface sending said edited data segment to said text-to-speech 
engine. 

21. (Original) The system of claim 18 wherein said visual interface sends data
to said text-to-speech engine over a first said communication channel and said text-to- 
speech engine sends data to said visual interface over a second said communication 
channel. 

22. (Currently Amended) A method for visual tuning text-to-speech conversion 
process, the method comprising: 

converting an input-text to a processed representation using a text-to-speech 
engine, said processed representation including at least one speech feature of said 
input-text; 

displaying said processed representation on a visual editing interface connected 
to said text-to-speech engine, said speech feature of said processed representation 
being displayed in a corresponding graphical form, thereby

communicating variations in one or more types of speech features associated 
with segments of said representation by varying visual display properties of the 
segments, wherein said speech features include at least one of boundary strength or 
pause duration, and said visual display properties are applied to at least one of a 
boundary between adjacent segments or spacing between textual contents of the 
adjacent segments; and

providing an editing function in said visual editing interface to a user for modifying 
said speech feature in said graphical form. 

23. (Original) The method of Claim 22 further comprising:

generating speech audio equivalent of said processed representation through 
said visual editing interface. 

24. (Original) The method of Claim 22 further comprising: 
saving said processed representation in a data store; and 

loading said processed representation stored in said data store into said visual 
editing interface. 

25. (Previously Presented) The method of Claim 22 further comprising: 
converting said processed representation into a modified textual representation
of the processed input-text.

26. (Currently Amended) The method of Claim 25 further comprising: 
converting said textual representation into a processed representation[[.]],

wherein the input text is used to generate said processed representation. 

27. (Previously Presented) The method of Claim 25 further comprising: 
storing said modified textual representation in a data store; and 

loading said modified textual representation stored in said data store into said 
visual editing interface. 

28. (Previously Presented) The method of Claim 25 further comprising:

using said modified textual representation to synthesize speech using a TTS
system distinct from said text-to-speech engine.

29. (Previously Presented) The system of claim 1, wherein said visual editing
interface displays a modified textual representation of said text-input, and variations in 
visual display for communicating different speech features individually associated with 
different textual segments of the textual representation include a combination of at least 
two of: (a) variations in graphical length of the textual segments; (b) variations in 
vertical positions of the textual segments; (c) variations in horizontal spacing of the 
textual segments; (d) variations in font faces of the textual segments; (e) variations in 
coloring of the textual segments; (f) variations in styles of the textual segments; (g) 
variations in orientation of the textual segments; (h) variations in rotation of the textual 
segments; or (i) punctuation of the textual segments. 